Abstract

COVID-19 is an infectious disease caused by a novel coronavirus called SARS-CoV-2. 
The first case appeared in December 2019, and until now it still represents a significant 
challenge to many countries in the world. Accurately detecting positive COVID-19 
patients is a crucial step to reduce the spread of the disease, which is characterized 
by a strong transmission capacity. In this work we implement a Residual Convolutional 
Neural Network (ResNet) for an automated COVID-19 diagnosis. The implemented 
ResNet can classify a patient’s chest X-ray (CXR) image as COVID-19 positive, pneumonia
caused by another virus or bacteria, or healthy. Moreover, to increase the accuracy
of the model and overcome the data scarcity of COVID-19 images, a personalized 
data augmentation strategy using a three-step Bayesian hyperparameter optimization 
approach is applied to enrich the dataset during the training process. The proposed 
COVID-19 ResNet achieves a 94% accuracy, 95% recall, and 95% F1-score in the test 
set. Furthermore, we also provide insight into which data augmentation operations are
successful in increasing CNN performance when doing medical image classification
with COVID-19 CXR.

License: Creative Commons Attribution-NonCommercial 4.0

Keywords: convolutional neural networks, deep learning, Bayesian optimization, medical
image analysis, data augmentation, hyperparameters 

Resumen (English translation)

COVID-19 is an infectious disease caused by a new coronavirus called SARS-CoV-2.
The first case appeared in December 2019, and it still represents a major challenge
worldwide. Accurate detection of the virus in COVID-19 positive patients is a crucial
step to reduce the spread of this highly contagious disease. In this work, a residual
convolutional neural network (ResNet) is implemented for the automated diagnosis of
COVID-19. The implemented ResNet can classify a patient's chest X-ray as COVID-19
positive, pneumonia caused by another virus or bacteria, or healthy. In addition, a
personalized data augmentation strategy is applied using a three-step Bayesian
optimization. The proposed ResNet achieves 94% accuracy, 95% recall, and a 95%
F1-score on the test set. Additionally, we present the data augmentation operations
that helped increase the performance of the neural network and that can be used by
other researchers in the development of models for medical image classification.

Palabras clave (keywords): convolutional neural networks, deep learning,
Bayesian optimization, medical image analysis, synthetic data augmentation,
hyperparameters

Journal sidebar: Oihane Fernandez / Recibido (Received) / 07/10/2021


INTRODUCTION 


COVID-19 is a disease caused by a new coronavirus called SARS-CoV-2. The first reports
of this new virus came from Wuhan, China. The most common symptoms
associated with COVID-19 are fever, fatigue, and dry cough. Meanwhile, the most severe
symptoms include shortness of breath, confusion, pressure in the chest and loss of
appetite [1]. In comparison with severe acute respiratory syndrome (SARS) and Middle
East respiratory syndrome (MERS), SARS-CoV-2 is characterized by a lower mortality rate
but a stronger transmission capacity [2].


The initial screening methods for COVID-19 diagnosis are the Real-Time reverse 
transcription-Polymerase Chain Reaction test (RT-PCR), antibody test (serology), and 
auxiliary diagnosis tests like computed tomography (CT) and chest X-ray (CXR) [2]. The 
RT-PCR test is recommended for people who have shown symptoms of COVID-19,
people who have been in contact with a confirmed case, and people who have been 
traveling or participating in social events [3]. On the other hand, the antibody test is 
suggested for people who believe they have had the virus in the past, as it looks for the 
antibodies in the blood [4]. 


The laboratory diagnosis methods (RT-PCR and antibody test) have two important
drawbacks: (1) a low viral load leads to a low detection rate that can produce
false-negative results, and (2) the viral tests show positive/negative results but cannot judge
the COVID-19 evolution in the chest [5]. In contrast, CT imaging and chest X-ray can be
used to detect and measure the severity of the virus. In China, CT is widely used as a
first-line investigation method in patients with COVID-19 [6] and is recommended as the
basis method for COVID-19 diagnosis [7]. However, CT practice implies a high demand
for radiology departments and, most importantly, a decontamination of the equipment
to reduce the risk of cross-infection, hence reducing the availability and applicability of
the method [8].


Chest X-ray is one of the most common diagnosis methods for lung disease due to 
its accessibility and rapid analysis. Moreover, CXR is not as expensive as CT and does 
not require a previous preparation of the patient [9]. Therefore, CXR analysis can be 
applied to help medical professionals diagnose COVID-19 in patients. However, imaging 
diagnosis does come with its own complications, such as the difficulty of accurately
distinguishing COVID-19 pneumonia from other forms of pneumonia caused by
cytomegalovirus, adenovirus, influenza A virus, influenza B virus, MERS, and other viral
and bacterial pneumonias [2]. Consequently, this can lead to a COVID-19 diagnosis delay.
According to the Radiological Society of North America, there are some characteristic 
manifestations on the CXR of a patient with Coronavirus [10]. These manifestations 
include the consolidation in the peripheral and mid to lower zone distribution, and the 
presence of bilateral patchy, bandlike ground-glass opacity [11]. In Fig. 1, we present 
three examples of CXR images. Image A shows a healthy patient, image B a person 
infected with COVID-19, and image C a person with pneumonia [12]. 


Figure 1. Examples of CXR images taken from [12]: A) healthy patient, B) patient infected with COVID-19, and
C) patient with pneumonia.


Despite the similarities between different types of pneumonia, it is possible to optimize 
the COVID-19 diagnosis through machine-learning (ML) techniques. Over the years, ML 
methods have shown reliable results in the analysis of medical images [13]. Deep learning
architectures, like convolutional neural networks (CNNs), have especially demonstrated
state-of-the-art performance in medical image classification [14, 15] and
segmentation [16, 17, 18] tasks. As Gu et al. mention, automated methods for medical
diagnosis can play an important role in future diagnostic procedures, especially with
the exploration of new CNN architectures that can improve the algorithm performance
[19]. In the COVID-19 context, CNNs can recognize visual patterns from CXR images of
COVID-19 patients and aid in the diagnosis, providing a rapid response and relieving the
demand for radiology experts.


CNNs have obtained remarkable success by automatically optimizing their millions of
parameters using labeled training data. Nevertheless, one of their drawbacks is
that such large-capacity models are prone to overfitting the training dataset if a small
dataset is provided. This is the case of most medical image datasets, where
labelled data is very expensive and time-consuming to obtain because it requires
an expert radiologist. [...]


In this work, a ResNet is implemented for COVID-19 automated diagnosis. The network
classifies a chest X-ray image (CXR) as COVID-19 positive, pneumonia, or healthy, and
data augmentation is applied to overcome data scarcity. [...] Furthermore, [...] the
corresponding data augmentation [...] on accuracy for COVID-19 detection. [...]
analysis of which data augmentation operations favor the classification, which will
help researchers determine which operations to apply in their own models and
specific problems.


LITERATURE REVIEW 


Medical imaging over the years has been essential for the visual representation of tissues
and organs. Many imaging formats have been created, such as Magnetic Resonance
Imaging (MRI), X-ray, Computed Tomography (CT), and others [22]. The evaluation of a
medical image is usually a manual and costly process, taking considerable time for a
radiologist or medical expert to inspect all the slices and imaging modalities. A successful
approach to shorten the inspection and evaluation time is to develop automatic models
that extract the most important features from the images [23].


The first COVID-19 case appeared in December 2019, and the virus still presents
a significant challenge to many countries in the world. To facilitate the diagnosis of
COVID-19, researchers have focused on developing machine learning and deep learning
methods that accelerate the detection of the virus in a patient’s medical images. This
is especially necessary since identifying the presence of the virus can help reduce the
transmission to uninfected individuals.




Various methods have been presented to automatically identify the disease using chest
X-ray images, CNNs being the most used because of their high accuracy and precision. Wang
et al. proposed a two-part classification method by applying a pre-trained Inception
network to convert the image data into one-dimensional feature vectors, followed by a
fully connected network to produce the classification prediction. The study reports an
accuracy of 79.3% on an external testing dataset [24]. In another study, Sethy et al. use a
two-step method to classify X-ray images. First, they apply a deep learning architecture
in the first layers to extract deep features from the image, and then they implement a
support vector machine in the last layer to perform the classification. Using a ResNet50
as the deep learning architecture, they report an accuracy of 95.38% on a public dataset
[25]. Xin et al. proposed an evolutionary multi-objective neural architecture search
method, also known as EMARS-A, to automatically find the architecture of a CNN for
COVID-19 classification. Their network achieves an accuracy of 89.67% [26]. On the other
hand, Narin et al. developed a model that used five pre-trained convolutional neural
networks based on the ResNet and Inception models and tested them on three different
binary datasets. An accuracy of 96.1% is obtained on dataset 1, 99.5% on dataset 2, and
99.7% on dataset 3 [27].


METHODOLOGY 


In this paper, we employ a convolutional neural network to distinguish CXR images from 
patients with COVID-19, patients with pneumonia caused by other infections, and healthy 
individuals. A Residual Convolutional Neural Network (ResNet) [21] is selected because 
of its capability to extract high level and complex features, which is necessary for the 
complicated task of medical image recognition. Moreover, to improve the classification 
accuracy and sensitivity of the model, the technique of data augmentation is extensively 
applied during training to increase the number of training samples, reduce overfitting, 
and increase the generalization capability of the model. To take full advantage of the
data augmentation method, the magnitude of the augmentation operations is selected 
using a Bayesian hyperparameter optimization approach. In the following subsections we 
present the residual neural network applied, describe the data augmentation method, 
and conclude by presenting the Bayesian hyperparameter optimization approach used. 


Residual neural network 


Neural network depth has a strong influence on the accuracy of a network. As a 
network becomes deeper, it is more capable of recognizing and modelling the complex
intricacies of the images. However, when deep networks are being trained, it is difficult
to propagate gradients, especially to the deeper layers, giving rise to the gradient
degradation problem. Gradient degradation causes information loss and reduces the
accuracy of the model [21]. The residual network (ResNet) is a very deep network that
implements a residual function to counteract the degradation problem. The residual
function, also known as residual connection, is implemented through a summation
function as shown in equation 1:


$y = f(x) + x$ (1)


where the network layers are represented by $f(x)$, and $x$ is the input feature map to the
first layer. The residual connection adds a connection between the input to the layer and
the output of the stacked layers, thus allowing the information to flow directly between
network layers during the forward propagation and, most importantly, permitting the
gradients to pass during backpropagation. In Fig. 2, the basic residual block implemented
in the COVID-19 ResNet classification architecture is presented.


Figure 2. Basic residual block for the COVID-19 ResNet architecture. Each residual block has a residual connection
that adds the input image/input feature of layer $l$ to the output of the two convolutional blocks, producing the
new input feature for layer $l+1$.


DOI: https://doi.org/10.18272/aci.v13i2.2288
Article, Section C. Vol. 13, no. 2. ID: 2288
Avances en Ciencias e Ingenierías
COVID-19 ResNet: Residual neural network for COVID-19 classification with three-step Bayesian optimization
Balseca / Cruz / Baldeon (2021)


The residual block, denoted as $y$, is composed of two convolutional blocks that
represent $f(x)$. Each convolutional block comprises a batch normalization layer
(BN), a ReLU activation function, and a convolutional layer. A ReLU activation function is
implemented because it has been shown to mitigate the vanishing gradient problem. Moreover,
a BN is also included to reduce the internal covariate shift and to normalize the output
of each layer. The size of the convolutional kernels is a hyperparameter that is optimized
with the three-step Bayesian Optimization. The input feature $x_l$ to layer $l$ forms a residual
connection with the transformed input $f(x_l)$ through a summation operation, forming
the input $x_{l+1}$ to layer $l+1$ ($x_{l+1} = y_l = f(x_l) + x_l$). Moreover, in residual blocks where the
output features have a different size than the input features of the following residual
block, an extra convolutional layer is included before the residual connection. Hence,
depending on the location of the residual block in the network, it might have two or
three convolutional layers to guarantee the correct flow of information.
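
The residual connection described above can be illustrated with a minimal NumPy sketch. It is not the authors' Keras implementation: it assumes 1x1 convolutions (so a convolution reduces to a per-pixel matrix product), a batch normalization without learned scale and shift, and a hypothetical projection matrix `w_proj` on the shortcut when the number of channels changes. All function and variable names are illustrative.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each channel over the spatial dimensions (no learned scale/shift).
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def conv1x1(x, w):
    # A 1x1 convolution is a linear map over channels applied at every pixel:
    # (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out).
    return x @ w

def conv_block(x, w):
    # Convolutional block in the order given in the text: BN -> ReLU -> convolution.
    return conv1x1(np.maximum(batch_norm(x), 0.0), w)

def residual_block(x, w1, w2, w_proj=None):
    fx = conv_block(conv_block(x, w1), w2)            # f(x): two convolutional blocks
    # Assumption: when shapes differ, an extra 1x1 convolution adapts the shortcut.
    shortcut = x if w_proj is None else conv1x1(x, w_proj)
    return fx + shortcut                              # y = f(x) + x

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 4))                        # toy feature map (H, W, C)
w1 = rng.normal(size=(4, 4))
w2 = rng.normal(size=(4, 4))

y_same = residual_block(x, w1, w2)
# With all-zero weights f(x) = 0, so the block passes its input through unchanged.
y_identity = residual_block(x, np.zeros((4, 4)), np.zeros((4, 4)))
# Changing the number of channels requires the extra projection on the shortcut.
y_wide = residual_block(x, w1, rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
```

The zero-weight case shows why the shortcut helps optimization: even when the stacked layers contribute nothing, the input still reaches the next block intact.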


The entire ResNet architecture implemented is presented in Fig. 3. The input to the
network is a 2D CXR image with shape 224 x 224 x 1. Thirty-two convolutional layers divided
into 14 residual blocks make up the body of the network. The number of filters in the
residual blocks increases progressively from 64 to 512. In detail, the ResNet structure has
two residual blocks with 64 filters, 10 residual blocks with 128 filters, 10 residual blocks
with 256 filters, and 2 residual blocks with 512 filters. As the number
of filters is increased, the size of the feature maps is reduced by half. Fig. 3 presents the
residual blocks grouped according to their number of filters and dimensions. In the last
layers, the ResNet applies an average pooling followed by a flatten layer to compress
the feature maps into a vector of size 25,088 (7 x 7 x 512). Finally, a fully connected layer with 3
neurons and a softmax classifier is used to predict the probabilities of an image being
part of each of the three classes (COVID-19 positive, pneumonia caused by other bacteria
or virus, and healthy). Our network has a total of 15,676,549 trainable parameters which
are optimized using the stochastic gradient descent method.


[Figure 3 diagram. Residual block groups by input and output size: 224x224 to 112x112x64,
112x112x64 to 56x56x128, 56x56x128 to 28x28x256, and 28x28x256 to 7x7x512, followed by
Average Pooling, Flatten [25,088], and Dense [3].]


Figure 3. The COVID-19 ResNet architecture. Each blue square represents a residual block. The size of the input to the 
residual block is located on top of the residual blocks, and the output size on the bottom. Each residual block contains 
two convolutional blocks with a batch normalization layer, ReLU activation function and a convolutional layer. 


Data augmentation 


Neural network models can be quite successful when a huge amount of data is available.
Nevertheless, in the medical field, acquiring data is costly and sometimes impossible,
limiting the quantity of images obtained. Hence, CNNs tend to overfit the training
dataset due to the considerable amount of model parameters that need to be fitted.
This issue is aggravated by the fact that the COVID-19 outbreak is recent, so the number
of publicly available images is limited.


Data augmentation is a method that increases the diversity of the training set by extending
it artificially through the application of random affine and elastic transformations to the
original images. The most common data augmentation operations are image rotation,
reflection, horizontal and vertical shifting, color adjustments, and scaling. In this work,
the data augmentation operations tested for training the COVID-19 ResNet are rotation,
scaling, width shift, height shift, vertical flip, and horizontal flip because they have been
shown to increase the accuracy of neural networks. A description of the tested data
augmentations is presented next, and an example of these transformations in a CXR
image is shown in Fig. 4.


[Figure 4 panels include: Original Image, Width Shift, Height Shift]


Figure 4. Data augmentation operations applied to a CXR image in the training dataset 


Rotation is an affine transformation that rotates an image $I$ by an angle $\theta$ around the
center pixel. It is applied through the matrix
$R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$.

Scaling scales an image $I$ in the horizontal or vertical direction by applying the affine
transformation $S = \begin{pmatrix} s_x & 0 \\ 0 & s_y \end{pmatrix}$, where $s_x$ and $s_y$ are the
scaling factors in x and y respectively.
This operation helps the neural network learn from different shapes and sizes of the region
of interest [28].

Width and Height Shift shifts the image $I$ by a given number of pixels on the horizontal
or vertical axis. Since the area of interest can be shifted to different regions in the image,
it forces the neural network to learn spatially invariant features.

Horizontal and Vertical Flip creates a reflection of the original image $I$ along the horizontal
or vertical axis. A horizontal flip swaps the right and left halves of the image. A vertical
flip swaps the upper and lower halves of the image. In natural images, only horizontal flips
create realistic images. However, in medical images vertical flips have also been shown to produce
realistic images [28].
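
As a small illustration of the operations listed above, the following NumPy sketch builds the rotation and scaling matrices and applies flips and a width shift to a toy array. It is only a sketch: the helper names are hypothetical, the shift fills the vacated pixels with zeros (one of several possible fill strategies), and a real pipeline would resample the whole image rather than transform single points.

```python
import numpy as np

def rotation_matrix(theta_deg):
    # R = [[cos t, -sin t], [sin t, cos t]] for an angle given in degrees.
    t = np.deg2rad(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

def scaling_matrix(sx, sy):
    # S = [[s_x, 0], [0, s_y]] scales the x and y coordinates independently.
    return np.array([[sx, 0.0],
                     [0.0, sy]])

def width_shift(img, pixels):
    # Shift columns to the right (positive) or left (negative), zero-filling the gap.
    out = np.zeros_like(img)
    if pixels > 0:
        out[:, pixels:] = img[:, :-pixels]
    elif pixels < 0:
        out[:, :pixels] = img[:, -pixels:]
    else:
        out[:] = img
    return out

# Coordinates are transformed by matrix multiplication.
rotated = rotation_matrix(90) @ np.array([1.0, 0.0])
scaled = scaling_matrix(2.0, 0.5) @ np.array([1.0, 1.0])

# A toy 3x4 "image" makes the pixel-level operations easy to inspect.
image = np.arange(1, 13).reshape(3, 4)
h_flip = image[:, ::-1]      # horizontal flip: left and right columns swap
v_flip = image[::-1, :]      # vertical flip: top and bottom rows swap
shifted = width_shift(image, 1)
```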


Bayesian hyperparameter optimization for data augmentation 


Determining the data augmentation operations that need to be applied in our specific
problem, and to what extent, is a difficult task because of the many possible combinations
that can be tested and their effect on the generalization and accuracy of the model. The
correct value of these operations could be set manually by considering the values found
in other publications [29]. However, data augmentation strategies successfully applied
in a dataset may not transfer as effectively to another dataset due to particularities of
each dataset and model. Hence, the data augmentation operations explained in section
3.2 (rotation, scaling, width shift, height shift, vertical flip, and horizontal flip) will be
treated as hyperparameters in our model and their optimal values selected using a
three-step Bayesian hyperparameter optimization approach. The search ranges for these
hyperparameters are initially set to the maximum allowable value to try to cover the
whole search space.


Furthermore, four more hyperparameters related to the model and training process 
are added to the hyperparameter search. These hyperparameters are the learning rate, 
batch size, number of training epochs, and kernel size. Setting the correct learning rate 
is critical because a rate that is too large will converge very fast to a suboptimal solution, 
whereas a rate that is too small will halt the training process. There is no rule of thumb 
for this hyperparameter because it depends on the specific neural network architecture. 
However, Bengio mentions that a good initial learning rate is less than 1 and greater
than 1e-6 [30].


To mitigate overfitting and underfitting, a good number of training epochs must be 
chosen. The number of epochs depends on the size of the dataset. Given that the 
dataset used in this work has a limited number of images, the range of the number of 
training epochs is set from 200 to 300.


Masters et al. recommend mini-batch sizes as small as two or four to improve network
accuracy [31]. Considering these recommendations and the computational limitations,
the range of the batch size is set between one and three. All the hyperparameters being
optimized are shown in Table 1.


In a hyperparameter optimization problem, the objective is to find the hyperparameter
values that minimize the validation cross-entropy loss function. Hence, the ten
hyperparameters presented represent the decision variables, and their search range
corresponds to the search space. A Bayesian optimization approach [32] is applied to
solve this problem because it has proven to be effective in solving non-linear and
non-convex optimization problems, and it reduces the search time through the application of a Gaussian
Process surrogate function.

The hyperparameters are optimized applying a three-step optimization process, in which
the hyperparameter search bounds are adjusted progressively to reduce the algorithm
search space and focus on the most promising search areas. Specifically, in each step we
apply the Bayesian optimization approach, with a Gaussian Process surrogate function
and an Expected Improvement acquisition function, in the defined search bounds and
find the best current solution. In the next step, we redefine the bounds by making them
tighter around the previously found solutions. The hyperparameters and their respective
search bounds in each step are presented in Table 1. Twenty initial points were collected
using Latin hypercube sampling to approximate the Gaussian Process before running
each optimization. The termination criteria for the optimization are 30 iterations or a
distance of 1e-8 or less between two consecutive points. The Bayesian optimization
process is implemented using the GPyOpt library [33].
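
The idea of progressively tightening the search bounds can be sketched in plain Python. This is a simplified illustration and not the authors' procedure: random search is swapped in for the Gaussian Process and Expected Improvement optimizer, `toy_loss` is a hypothetical stand-in for the validation loss, and the shrinking rule in `tighten` (halving each range around the best point) is an assumption, since the text does not state the exact rule.

```python
import random

def toy_loss(params):
    # Hypothetical stand-in for the validation cross-entropy loss.
    return (params["zoom"] - 0.36) ** 2 + (params["width_shift"] - 0.23) ** 2

def search(bounds, n_iter, rng):
    # Random search used here in place of the Bayesian optimizer.
    best_params, best_loss = None, float("inf")
    for _ in range(n_iter):
        cand = {k: rng.uniform(lo, hi) for k, (lo, hi) in bounds.items()}
        loss = toy_loss(cand)
        if loss < best_loss:
            best_params, best_loss = cand, loss
    return best_params, best_loss

def tighten(bounds, best, shrink=0.5):
    # Assumed rule: center each range on the best value found, keep a fraction
    # of the previous width, and never leave the previous bounds.
    new_bounds = {}
    for k, (lo, hi) in bounds.items():
        half = shrink * (hi - lo) / 2
        new_bounds[k] = (max(lo, best[k] - half), min(hi, best[k] + half))
    return new_bounds

rng = random.Random(0)
bounds = {"zoom": (-1.0, 1.0), "width_shift": (-1.0, 1.0)}
history = []          # (bounds used, best loss) for each of the three steps
best_overall = None
for step in range(3):
    best, loss = search(bounds, n_iter=30, rng=rng)
    history.append((dict(bounds), loss))
    if best_overall is None or loss < best_overall[1]:
        best_overall = (best, loss)
    bounds = tighten(bounds, best)
```

Each step spends the same budget on a smaller region, which is the mechanism the three-step approach relies on to concentrate evaluations near promising values.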


Table 1. Hyperparameters and search bounds optimized in each of the three steps of the Bayesian optimization
for the COVID-19 ResNet

Hyperparameters    Step 1                  Step 2                  Step 3
Rotation           0 to 359 degrees        0 to 90 degrees         68.28 to 240.41 degrees
Zoom               -1.0 to 1               0.0 to 0.5              0.34 to 0.37
Width shift        -1.0 to 1               0.0 to 0.5              0.19 to 0.35
Height shift       -1.0 to 1               0.0 to 0.5              -0.13 to 0.15
Horizontal flip    True (1) or False (0)   True (1) or False (0)   True (1) or False (0)
Vertical flip      True (1) or False (0)   True (1) or False (0)   True (1) or False (0)
Learning rate      3e-4 to 1e-5            3e-4 to 1e-5            3e-4 to 1e-5
Batch size         1, 2 and 3              1, 2 and 3              1, 2 and 3
Kernel             1, 3 and 5              1, 3 and 5              1, 3 and 5
Epochs             200 to 300              200 to 300              200 to 250
Minimized Loss     0.222                   0.150                   0.107


Finally, using the Bayesian Optimization approach to obtain the best hyperparameters
requires high computational resources. Given our computational limitations, we apply
a principal component analysis (PCA) to reduce the dimensionality of the data. One
hundred forty principal components per image are selected, which capture 95% of the
total variance of the originally 50,176-dimensional image. Therefore, the hyperparameter
optimization is performed with the 140-dimensional reconstructed images. Applying
the PCA reduction drastically decreases the RAM requirements from 15.51 GB, when the
original images are used, to 2.79 GB when the reconstructed images are applied. Despite
PCA being an excellent approximator, using the reconstructed images can miss valuable
information about the original input. In consequence, we only use these reconstructed
inputs for the hyperparameter optimization, and the original images are used for
the model training and testing.
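
The selection of the number of principal components that retain 95% of the variance can be sketched with NumPy as follows. The data here are synthetic (60 flattened 16x16 "images" with a low-rank structure), and the helper names are hypothetical; the sketch only illustrates the technique, not the authors' code or their 140-component result.

```python
import numpy as np

def n_components_for_variance(X, target=0.95):
    # Explained variance of each component is proportional to its squared singular value.
    Xc = X - X.mean(axis=0)
    s = np.linalg.svd(Xc, compute_uv=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index whose cumulative ratio reaches the target, converted to a count.
    return int(np.searchsorted(ratio, target) + 1)

def pca_project(X, k):
    # Project onto the first k principal axes and map back to pixel space.
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:k]
    scores = (X - mean) @ components.T       # k-dimensional representation
    reconstruction = scores @ components + mean
    return scores, reconstruction

rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 5))            # 5 hidden factors drive the images
mixing = rng.normal(size=(5, 256))
X = latent @ mixing + 0.01 * rng.normal(size=(60, 256))

k = n_components_for_variance(X, target=0.95)
scores, reconstruction = pca_project(X, k)

# Fraction of the centered signal lost by keeping only k components.
relative_error = np.linalg.norm(X - reconstruction) / np.linalg.norm(X - X.mean(axis=0))
```

Storing `scores` instead of `X` is what reduces memory use: each image is described by `k` numbers rather than by every pixel.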

This study is performed on a publicly available dataset [12] composed of 420 2D CXR
images in the posteroanterior (PA) chest view. The images are classified into three
categories: COVID-19 patients (140 images), healthy or normal individuals (140 images),
and patients with pneumonia caused by bacteria or a virus different from COVID-19 (140
images). The dataset is available in the GitHub repository of the authors:
https://github.com/abzargar/COVID-Classifier.git


The dataset is split into 70% images for training, 15% images for validation, and 15% 
images for testing. A simple random sampling method is applied to select the images for 
each set, making sure that each category is equally represented in each set. The number of
images per category and set are presented in Table 2. 
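
A class-balanced 70/15/15 split of this kind can be sketched with the Python standard library. The function name, the label strings, and the fixed seed are illustrative assumptions; the sketch only shows that sampling within each category yields 98, 21, and 21 images per class.

```python
import random

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    # Shuffle the indices of each class separately so every subset stays balanced.
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]       # remainder goes to the test set
    return train, val, test

# 140 images per category, as in the dataset described above.
labels = ["covid"] * 140 + ["normal"] * 140 + ["pneumonia"] * 140
train, val, test = stratified_split(labels)
covid_in_train = sum(1 for i in train if labels[i] == "covid")
```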


Table 2. Number of CXR images per category for training, validation, and testing

Category      COVID-19   Normal   Pneumonia
Training      98         98       98
Validation    21         21       21
Test          21         21       21
Total         140        140      140


Preprocessing operations 


The main motivation for image preprocessing is to standardize and enhance the image 
to facilitate the feature extraction. The preprocessing steps applied to the images 
are gray scale conversion, resizing, and adaptive histogram equalization. An original 
image and its preprocessed counterpart are shown in Fig. 5. The preprocessing steps 
are explained next.


Figure 5. Example of a preprocessed CXR image: A) unprocessed normal CXR, B) processed normal 224x224 CXR


Gray scale conversion 

Chest X-ray images contain considerable noise, such as blurring, fog, low contrast, and
unwanted information. Gray scale conversion is a widely used technique to reduce the 
noise and computational cost of an image [34, 35]. This conversion transforms the color 
values of the original images (24 bit) represented in three dimensions XYZ (lightness, 
chroma and hue) into grayscale images represented only by the luminance (8 bit). The 
processed grayscale image has pixel values in the range of 0 (black) to 255 (white). 


Resizing 

The size of the images in the original dataset varies, so to prevent compilation errors 
during the preprocessing operations and model training, they are resized to a fixed 
224x224 pixel size. 


Adaptive histogram equalization

Adaptive histogram equalization is a contrast enhancement method that applies to each pixel
in the image a mapping based on the surrounding pixels. A study by Sherrier et
al. shows that adaptive histogram equalization applied to chest radiography allows
certain regions of the CXR to be enhanced differentially [36].


Training 
The ResNet is trained with a cross-entropy loss function and the Adam optimizer with the
parameters recommended in [37]. The weights are initialized from a Gaussian distribution
centered on 0 with a standard deviation of $\sqrt{2 / (f_{in} + f_{out})}$, where $f_{in}$ is the number of input
units in the weight tensor, and $f_{out}$ the number of output units in the weight tensor. The
number of training epochs is selected based on the results of the Bayesian optimization,
as well as the data augmentation operations and their values. This information will be
presented in the next section.
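
As a brief illustration of the weight initialization, the sketch below draws weights from a zero-mean Gaussian whose standard deviation depends on the number of input and output units. It assumes the standard deviation is sqrt(2 / (fan_in + fan_out)), i.e. the Glorot (Xavier) normal scheme; this reading is an assumption, and the sketch uses a plain normal distribution rather than the truncated one that some libraries apply. The layer sizes are arbitrary.

```python
import numpy as np

def glorot_normal(fan_in, fan_out, rng):
    # Assumed scheme: zero-mean Gaussian with std = sqrt(2 / (fan_in + fan_out)).
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(loc=0.0, scale=std, size=(fan_in, fan_out)), std

rng = np.random.default_rng(0)
weights, expected_std = glorot_normal(fan_in=256, fan_out=128, rng=rng)
empirical_std = float(weights.std())
```

Scaling the spread by the layer width keeps the magnitude of activations and gradients roughly constant from layer to layer at the start of training.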


The code was developed using Google Colaboratory Professional, with a high-RAM
hosted runtime with 27.4 gigabytes of RAM available and 147.16 gigabytes of available disk
space. The programming language used was Python V3.0, and the neural network was
implemented with the Keras library. The code of our work can be found in a GitHub
repository using the following link: https://github.com/titulacion2021/Image-Classification-ResNet

RESULTS 
Bayesian optimization results 


The best hyperparameter values found in each step of the optimization process, their 
respective loss and validation accuracy are presented in Table 3. In the first step, the 
Bayesian optimization took 15.2 hours to find the best hyperparameters, reached a 0.22 
loss, and had a 90% validation accuracy. The second step took 11.19 hours, achieved a 
92% validation accuracy, and had a 0.15 loss. In the last step, the optimization converged 
in 5.4 hours, reached the highest validation accuracy of 97%, and had a loss of 0.10. The 
hyperparameters found in the third step are the hyperparameter values used to train 
and test the COVID-19 ResNet. 


Table 3. Best hyperparameter values found in each step of the three-step Bayesian hyperparameter
optimization. The best hyperparameters found in step 3 are used to train the COVID-19 ResNet.

Hyperparameters        Step 1            Step 2           Step 3
Rotation               240.41 degrees    64.28 degrees    187.5 degrees
Zoom                   0.34              0.37             0.36
Width shift            0.36              0.19             0.23
Height shift           -0.13             0.16             -0.13
Horizontal flip        True (1)          True (1)         True (1)
Vertical flip          False (0)         True (1)         False (0)
Learning rate          0.0003            0.00025          0.0002
Batch size             3                 2                3
Kernel                 1                 3                1
Epochs                 200               300              200
Minimized Loss         0.223             0.150            0.107
Validation Accuracy    90%               92%              97%

Furthermore, to test the adequacy of the three-step Bayesian Optimization, we compare 
the COVID-19 ResNet against two competing models. First, we train the ResNet 
architecture using the data augmentation values recommended in the literature [38]. 


The hyperparameter values tested and the corresponding validation accuracy are
presented in Table 4. A 95.0% validation accuracy is reached, which is 2 percentage points less than the
accuracy obtained using the three-step optimization approach. Considering the need
to accurately diagnose COVID-19 in a patient, a 2 percentage point increase is an important gain.


Table 4. Hyperparameters tested based on literature and the validation accuracy achieved

Hyperparameters        Values
Rotation               45
Zoom                   0.2
Width shift            0.2
Height shift
Horizontal flip
Vertical flip
Learning rate
Batch size
Kernel
Epochs                 200
Validation Accuracy    95%

Secondly, we implement the traditional Bayesian Optimization method to optimize
the hyperparameters with the wide ranges set up in step 1 of Table 1 and let it run
for 90 iterations, which is the total number of iterations run in the three-step Bayesian
optimization method presented. The best hyperparameters found with the traditional
Bayesian optimization and the validation accuracy achieved are shown in Table 5.
A 95.2% validation accuracy is attained, which shows that the proposed three-step
method improves the validation accuracy by approximately 2 percentage points. Furthermore, the
three-step Bayesian optimization method is especially applicable when there is a time limit on
the use of the computational resources, as each step takes less time to run than the
traditional Bayesian Optimization executed as one long run.


Table 5. Hyperparameters found with the traditional Bayesian Optimization and validation accuracy achieved 

Hyperparameters        Values 
Rotation 
Zoom 
Width shift 
Height shift 
Horizontal flip 
Vertical flip 
Learning rate 
Batch size 
Kernel 
Epochs 
Validation Accuracy 
Classification results 


In this subsection we present the classification results obtained by the best ResNet (the 
ResNet trained with the optimal hyperparameter values obtained in 4.4.1) in the test set. 
The metrics used for the evaluation are the accuracy, precision, recall, and F1-score. The 
metrics are defined in equations 2-5. 


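\[ \text{Accuracy} = \frac{TN + TP}{TN + TP + FN + FP} \tag{2} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{3} \]

\[ \text{Recall} = \frac{TP}{TP + FN} \tag{4} \]

\[ F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{5} \]

Here TP, TN, FP, and FN denote the true positives, true negatives, false positives, and 
false negatives of a given class. 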


The evaluation metrics in the test set are presented in Table 6. The accuracy reached by 
the model is 94%, with 59 correctly classified samples out of 63. In terms of the precision, 
the model detects the positive COVID-19 and pneumonia cases with a high precision of 
95%. Meanwhile, the normal class has a lower precision of 91%. Analyzing the 
recall, 100% is obtained in the normal class, 95% in the COVID-19 class, and 86% in 
the pneumonia class. Recall measures how few false negatives a class has, hence 
a 100% recall in the normal class means that no healthy individual in the test set was 
classified as having pneumonia caused by COVID-19 or by another virus or bacteria. 
Finally, the F1-score is the harmonic mean of the precision and recall. All 
classes have an F1-score greater than or equal to 90%, which means that the model 
provides an adequate balance between precision and recall. 
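
As an illustration of equations 2-5, the following minimal Python sketch computes the per-class metrics from a three-class confusion matrix. The matrix is hypothetical: it assumes 21 test images per class, and its counts were chosen only so that the output matches the figures quoted above. It is not taken from the study. For the multi-class case, the accuracy is computed as the number of correctly classified samples divided by the total number of samples.

```python
# Hypothetical confusion matrix (rows: true class, columns: predicted class).
# ASSUMPTION: 21 test images per class; the counts are illustrative and were
# picked to be consistent with the reported metrics, not taken from the paper.
CLASSES = ["COVID-19", "Pneumonia", "Normal"]
CONFUSION = [
    [20, 1, 0],   # true COVID-19
    [1, 18, 2],   # true pneumonia
    [0, 0, 21],   # true normal
]


def per_class_metrics(matrix, k):
    """Return (precision, recall, F1) for class index k."""
    tp = matrix[k][k]
    fp = sum(matrix[i][k] for i in range(len(matrix))) - tp  # column minus diagonal
    fn = sum(matrix[k]) - tp                                 # row minus diagonal
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    print(f"accuracy: {accuracy(CONFUSION):.2f}")
    for k, name in enumerate(CLASSES):
        p, r, f1 = per_class_metrics(CONFUSION, k)
        print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

With these illustrative counts, 59 of the 63 samples lie on the diagonal, which reproduces the 94% accuracy.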


Table 6. Classification results for each category with the COVID-19 ResNet in the test set 

Metrics      Precision   Recall   F1-Score 
COVID-19     95%         95%      95% 
Pneumonia    95%         86% 
Normal       91%         100% 
Accuracy     94% 

Benchmark comparison 
To compare the performance of our network, in Table 7 we present the results of our 
work and highly cited studies that perform COVID-19 image classification. Furthermore, 
to provide a better background of each study, the preprocessing techniques and 
architecture implemented by each work are included. As can be seen, the ResNet is 
a popular architecture, but what distinguishes ours is the application of the Bayesian 
Optimization to find the best data augmentation and model hyperparameters. The 
proposed COVID-19 ResNet reaches a competitive accuracy, being ranked 
third against the other works. It is also ranked second in terms of sensitivity (recall). 


Table 7. Evaluation metrics of the implemented COVID-19 ResNet (*) and competing state-of-the-art models on 
COVID-19 image classification. A dash means the authors have not computed the metric. 

Model                        Architecture   Preprocessing                               Accuracy (%)   Recall (%)   F1-score (%) 
Maghdid et al. [39]          AlexNet        Cropping, resizing                          98             -            - 
Farooq et al. [40]           ResNet50       Rescaling, data augmentation, normalizing   96.23          100          100 
COVID-19 ResNet (Ours) (*)   ResNet         …, color adaptative, reduce dimension       94             95           95 
Wang et al. [41]             COVID-Net      Rescaling, data augmentation                93.3           91           - 
Wu et al. [42]               ResNet50       Segmentation, rescaling, multiview fusion   76             81.1         - 


DISCUSSION AND CONCLUSIONS 


The aim of this work was to develop a CNN to automatically classify CXR images into 
COVID-19 positive, pneumonia caused by bacteria or a virus other than COVID-19, or 
healthy. A ResNet architecture was implemented due to its capability to extract complex 
short- and long-range features, while preventing the gradient degradation problem. The 
network was trained on a publicly available dataset composed of 420 2D chest X-rays. 
To overcome the challenges of working with a small dataset, a data augmentation 
technique was applied. Six data augmentation operations were tested, namely 
rotation, scaling, width shift, height shift, horizontal flip, and vertical flip. Furthermore, 
the learning rate, kernel size, and batch size of the model were also optimized for this 
problem. The best values for these nine hyperparameters were selected using a three-step 
Bayesian optimization approach. 


In each step of the three-step optimization process, the Bayesian hyperparameter 
optimization algorithm minimized the validation cross-entropy loss while searching 
for the best hyperparameter values. In the initial step, the search bound for the 
hyperparameters was set to the maximum range allowed by the operation. In the following 
steps, the bound was progressively adjusted around the best solutions found so far to 
reduce the feasible space and exploit the search region. The strategy proved effective, 
as the best solutions found in each step kept increasing the validation accuracy 
and reducing the cross-entropy loss. Moreover, while the ResNet with the 
recommended hyperparameter values found in the literature reached a 95% validation 
accuracy, the ResNet with the optimized hyperparameters obtained a 97% validation 
accuracy. Hence, this three-step optimization process improved the validation accuracy 
by 2 percentage points, which is an important gain considering the need to accurately 
diagnose a COVID-19 positive patient to reduce transmission to uninfected individuals. 
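
The bound-narrowing idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in this work: uniform random sampling stands in for the Bayesian optimizer, the 90 iterations are split evenly into 30 per step, each range is halved around the best point found so far, only continuous hyperparameters are handled, and `toy_loss` is a hypothetical objective that replaces the validation cross-entropy loss of a trained network.

```python
import random


def toy_loss(x):
    """Hypothetical smooth objective standing in for the validation loss."""
    return (x["rotation"] - 90.0) ** 2 / 1e4 + (x["zoom"] - 0.3) ** 2


def random_search(objective, bounds, n_iter, rng):
    """Stand-in for one Bayesian optimization run: uniform random sampling."""
    best_x, best_loss = None, float("inf")
    for _ in range(n_iter):
        x = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        loss = objective(x)
        if loss < best_loss:
            best_x, best_loss = x, loss
    return best_x, best_loss


def shrink_bounds(bounds, center, factor):
    """Narrow each range to `factor` of its width around `center`,
    clipped so that it never leaves the previous range."""
    new_bounds = {}
    for name, (lo, hi) in bounds.items():
        half = (hi - lo) * factor / 2
        new_bounds[name] = (max(lo, center[name] - half),
                            min(hi, center[name] + half))
    return new_bounds


def three_step_search(objective, bounds, iters_per_step=30, factor=0.5, seed=0):
    """Run three searches, narrowing the bounds around the best point so far."""
    rng = random.Random(seed)
    best_x, best_loss = None, float("inf")
    history = []  # (bounds used in the step, best loss after the step)
    for _ in range(3):
        x, loss = random_search(objective, bounds, iters_per_step, rng)
        if loss < best_loss:
            best_x, best_loss = x, loss
        history.append((dict(bounds), best_loss))
        bounds = shrink_bounds(bounds, best_x, factor)
    return best_x, best_loss, history


if __name__ == "__main__":
    wide = {"rotation": (0.0, 360.0), "zoom": (0.0, 1.0)}  # hypothetical ranges
    best_x, best_loss, history = three_step_search(toy_loss, wide)
    for step, (used, loss) in enumerate(history, start=1):
        print(step, used, round(loss, 5))
```

Because the best point is carried from one step to the next, the best loss recorded after each step can only stay equal or decrease, which mirrors the behaviour described above.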


In reference to the data augmentation operations, the results obtained through the 
Bayesian optimization showed that using a vertical flip does not help increase the 
classification accuracy. This is an interesting finding, since vertical flips have been shown 
to be useful in other types of medical image classification problems and are used as a 
default operation when training the networks. Horizontal flips, on the other hand, helped 
to increase accuracy and are highly recommended in COVID-19 recognition 
tasks. The rotation operation also proved to be a successful technique, 
but the angle of rotation should not go beyond 190 degrees, as the accuracy starts to 
decrease. Scaling was also found to be a beneficial operation when a maximum of 37% 
zoom was applied. In general, scaling over 40% can cause some of the regions of interest 
to be missed, hence reducing the recognition capability of the model. 


Finally, the height and width shifts helped increase the accuracy, but only at small 
values. In general, all data augmentation operations should be included during training, 
with the exception of the vertical flip, but the magnitude of each operation should be set 
to medium values (as shown in Table 2). If extreme values are used in data augmentation, 
the artificially produced images do not adhere to reality and affect the training process 
of the network. Hence, the validation accuracy decreases instead of increasing. In 
conclusion, the values obtained in this work for the data augmentation operations may 
be highly applicable for other research focused on CXR classification where only a 
limited dataset is available. 
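
These recommendations can be expressed as a small sampling routine. The sketch below is only an illustration: the 190 degree and 37% limits come from the discussion above, whereas the shift limit of 0.1, the parameter names, the 50% flip probability, and the choice to sample each magnitude uniformly are assumptions made for the example.

```python
import random

MAX_ROTATION_DEG = 190.0  # from the discussion: accuracy drops beyond this angle
MAX_ZOOM = 0.37           # from the discussion: largest beneficial zoom
MAX_SHIFT = 0.1           # ASSUMPTION: placeholder for a "small" shift fraction


def sample_augmentation(rng):
    """Draw one set of augmentation parameters within the limits above."""
    return {
        "rotation_deg": rng.uniform(0.0, MAX_ROTATION_DEG),
        "zoom": rng.uniform(0.0, MAX_ZOOM),
        "width_shift": rng.uniform(-MAX_SHIFT, MAX_SHIFT),
        "height_shift": rng.uniform(-MAX_SHIFT, MAX_SHIFT),
        "horizontal_flip": rng.random() < 0.5,  # assumed flip probability
        "vertical_flip": False,  # excluded: it did not improve accuracy
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_augmentation(rng))
```

Each call returns the settings for one augmented image; applying them to the pixels is left to the image library of choice.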